{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 编码器—解码器（encoder-decoder）\n",
    "\n",
    "在基于词语的语言模型中，我们使用了循环神经网络。它的输入是一段不定长的序列，输出却是定长的，例如一个词语。然而，很多问题的输出也是不定长的序列。以机器翻译为例，输入是可以是英语的一段话，输出可以是法语的一段话，输入和输出皆不定长，例如\n",
    "\n",
    "    英语：They are watching.\n",
    "\n",
    "    法语：Ils regardent.\n",
    "\n",
    "\n",
    "\n",
    "当输入输出都是不定长序列时，我们可以使用编码器—解码器（encoder-decoder）或者seq2seq\n",
    "\n",
    "编码器（encoder）对应输入序列，解码器（decoder）对应输出序列。下面我们来介绍编码器—解码器的设计。\n",
    "\n",
    "![encoder-decoder](./images/encoder_decoder.png)\n",
    "\n",
    "编码器的作用是把一个不定长的输入序列转化成一个定长的 Context 向量 $(\\mathbf{c})$ 。该向量包含了输入序列的信息。\n",
    "\n",
    "解码器的作用是接收一个定长的 Context 向量 $(\\mathbf{c})$ 。然后解读该向量得到一个不定长的输出序列。\n",
    "\n",
    "\n",
    "常用的编码器、解码器是循环神经网络，但是需要注意的是：这时候的循环神经网络通常会是多层的。\n",
    "\n",
    "\n",
    "## 编码器（Encoder）\n",
    "\n",
    "为了演示Encoder，我们构建一个比较简单的编码器 Encoder，一个 Embedding 层和一个GRU层 （这里我们不使用已经训练好的词向量，而是直接使用 nn.Embedding，在实际项目中最好使用 Glove 或者 word2vec）\n",
    "\n",
    "![encoder-network](./images/encoder-network.png)\n",
    "\n",
    "\n",
    "需要注意的是，在我们演示的模型中，我们选择 encoder 最后一个 hidden layer 的输出作为 encoder 的输出，这个输出有时候也被称为 *context vector* 因为，它可以被看作是将整个序列的信息进行编码而得到的。\n",
    "\n",
    "\n",
    "## 解码器（Decoder）\n",
    "\n",
    "为了演示Decoder，我们构建一个比较简单的编码器 Decoder，也是一个 Embedding 层和一个GRU层\n",
    "\n",
    "![decoder-network](./images/decoder-network.png)\n",
    "\n",
    "\n",
    "ps: 上述图像来自 http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html\n",
    "\n",
    "\n",
    "## 大纲\n",
    "\n",
    "为了实现一个 Neural Translation Machine 我们需要按照以下步骤行动：\n",
    "\n",
    "1、构建一个Config类，用于保存各种超参数，以及导入各种包\n",
    "\n",
    "2、数据预处理\n",
    "\n",
    "3、构建编码器\n",
    "\n",
    "4、构建解码器\n",
    "\n",
    "5、随机采样，对模型进行测试\n",
    "\n",
    "对不起，最后的结果... 宛若一个智障..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第一步：构建一个Config类，用于保存各种超参数，以及导入各种包"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "from torch.autograd import Variable\n",
    "from torch import optim\n",
    "import torch.nn.functional as F\n",
    "import unicodedata, string, re, random, time, math"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class Config():\n",
    "    def __init__(self):\n",
    "        self.data_path = \"../data/cmn-eng/cmn.txt\" # 数据放在 /data 目录下\n",
    "        self.use_gpu = True\n",
    "        self.hidden_size = 128\n",
    "        self.encoder_lr = 5*1e-5\n",
    "        self.decoder_lr = 5*1e-5\n",
    "        self.train_num = 100000 # 训练数据集的数目\n",
    "        self.print_epoch = 10000\n",
    "        self.MAX_Len = 15\n",
    "config = Config()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第二步：数据预处理\n",
    "\n",
    "准备数据的全部过程如下所示：\n",
    "\n",
    "1. 读取txt文件，并按行分割，再把每一行分割成一个pair (Eng, Chinese)\n",
    "2. 过滤并处理文本信息\n",
    "3. 从每个pair中，制作出 中文词典 和 英文词典\n",
    "4. 构建训练集\n",
    "\n",
    "data下载地址为： http://www.manythings.org/anki/cmn-eng.zip\n",
    "\n",
    "该数据集中还有其他类型的翻译数据 http://www.manythings.org/anki/\n",
    "\n",
    "\n",
    "——————————————————————\n",
    "\n",
    "**这里需要注意，当我们下载完成之后，我们要把数据放在主目录下的 /data 文件夹下**\n",
    "\n",
    "格式：/data/cmn-eng/cmn.txt\n",
    "\n",
    "\n",
    "——————————————————————\n",
    "\n",
    "\n",
    "中文词典和英文词典，我们使用*Lang* 类，该类包含了所有的 中文（英文） -> 数字 或者 数字 -> 中文（英文）的映射。\n",
    "\n",
    "同时，我们要给一句话的其实和结束加上标志符\n",
    "\n",
    "起始符：(Start Of Sentence)\n",
    "\n",
    "SOS_token = 0\n",
    "\n",
    "结束符：(End Of Sentence)\n",
    "\n",
    "EOS_token = 1\n",
    "\n",
    "另外，在这个类中，我们需要添加一个 *word2count* 方法，用来计算各个词出现的次数\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "SOS_token = 0\n",
    "EOS_token = 1\n",
    "\n",
    "class Lang():\n",
    "    def __init__(self, name):\n",
    "        self.name = name\n",
    "        self.word2index = {}\n",
    "        self.index2word = {0: \"SOS\", 1: \"EOS\"}\n",
    "        self.word2count = {}\n",
    "        self.n_words = 2  # Count SOS and EOS\n",
    "    \n",
    "    def addSentence(self, sentence):\n",
    "        if self.name == \"Chinese\":\n",
    "            for word in sentence:\n",
    "                self.addWord(word)\n",
    "        else:\n",
    "            for word in sentence.split(' '):\n",
    "                self.addWord(word)\n",
    "    \n",
    "    def addWord(self, word):\n",
    "        if word not in self.word2index:\n",
    "            self.word2index[word] = self.n_words\n",
    "            self.word2count[word] = 1\n",
    "            self.index2word[self.n_words] = word\n",
    "            self.n_words += 1\n",
    "        else:\n",
    "            self.word2count[word] += 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def unicodeToAscii(s):\n",
    "    return ''.join(\n",
    "        c for c in unicodedata.normalize('NFD', s)\n",
    "        if unicodedata.category(c) != 'Mn'\n",
    "    )\n",
    "\n",
    "# Lowercase, trim, and remove non-letter characters\n",
    "def normalizeString(s):\n",
    "    s = unicodeToAscii(s.lower().strip())\n",
    "    s = re.sub(r\"([.!?])\", r\" \\1\", s)\n",
    "    s = re.sub(r\"[^a-zA-Z.!?]+\", r\" \", s)\n",
    "    return s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def readLangs(lang1, lang2, pairs_file, reverse=False):\n",
    "    print(\"Reading lines...\")\n",
    "\n",
    "    # Read the file and split into lines\n",
    "    lines = open(pairs_file, encoding='utf-8').read().strip().split('\\n')\n",
    "    # Split every line into pairs and normalize\n",
    "    pairs = []\n",
    "    for l in lines:\n",
    "        temp = l.split('\\t')\n",
    "        eng_unit = normalizeString(temp[0])\n",
    "        chinese_unit = temp[1]\n",
    "        pairs.append([eng_unit, chinese_unit])\n",
    "    \n",
    "    # Reverse pairs, make Lang instances\n",
    "    if reverse:\n",
    "        pairs = [list(reversed(p)) for p in pairs]\n",
    "        input_lang = Lang(lang2)\n",
    "        output_lang = Lang(lang1)\n",
    "    else:\n",
    "        input_lang = Lang(lang1)\n",
    "        output_lang = Lang(lang2)\n",
    "        \n",
    "    return input_lang, output_lang, pairs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "MAX_LENGTH = config.MAX_Len  # 长度大于15的我们统统舍弃\n",
    "\n",
    "eng_prefixes = (\n",
    "    \"i am \", \"i m \",\n",
    "    \"he is\", \"he s \",\n",
    "    \"she is\", \"she s\",\n",
    "    \"you are\", \"you re \",\n",
    "    \"we are\", \"we re \",\n",
    "    \"they are\", \"they re \",\n",
    "    \"i\", \"he\", 'you', 'she', 'we',\n",
    "    'they', 'it'\n",
    ")\n",
    "\n",
    "def filterPair(p):\n",
    "    return len(p[0].split(' ')) < MAX_LENGTH and \\\n",
    "        len(p[1]) < MAX_LENGTH and \\\n",
    "        p[0].startswith(eng_prefixes)\n",
    "\n",
    "def filterPairs(pairs):\n",
    "    return [pair for pair in pairs if filterPair(pair)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Reading lines...\n",
      "Read 19777 sentence pairs\n",
      "Trimmed to 9473 sentence pairs\n",
      "Counting words...\n",
      "Counted words:\n",
      "Eng 字典的大小为 3737\n",
      "Chinese 字典的大小为 2638\n",
      "['i paid ten dollars for this cap .', '我付了十美元買這頂帽子。']\n"
     ]
    }
   ],
   "source": [
    "def prepareData(lang1, lang2, pairs_file, reverse=False):\n",
    "    input_lang, output_lang, pairs = readLangs(lang1, lang2, pairs_file, reverse)\n",
    "    print(\"Read %s sentence pairs\" % len(pairs))\n",
    "    pairs = filterPairs(pairs)\n",
    "    print(\"Trimmed to %s sentence pairs\" % len(pairs))\n",
    "    print(\"Counting words...\")\n",
    "    for pair in pairs:\n",
    "        input_lang.addSentence(pair[0])\n",
    "        output_lang.addSentence(pair[1])\n",
    "    print(\"Counted words:\")\n",
    "    print(input_lang.name, \"字典的大小为\", str(input_lang.n_words))\n",
    "    print(output_lang.name, \"字典的大小为\", str(output_lang.n_words))\n",
    "    return input_lang, output_lang, pairs\n",
    "\n",
    "input_lang, output_lang, pairs = prepareData('Eng', 'Chinese', config.data_path)\n",
    "print(random.choice(pairs))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**到目前为止，我们已经把字典构建好了，接下来就是构建训练集**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def indexesFromSentence(lang, sentence):\n",
    "    if lang.name == \"Chinese\":\n",
    "        return [lang.word2index[word] for word in sentence]\n",
    "    else:\n",
    "        return [lang.word2index[word] for word in sentence.split(' ')]\n",
    "\n",
    "def variableFromSentence(lang, sentence, use_gpu):\n",
    "    indexes = indexesFromSentence(lang, sentence)\n",
    "    indexes.append(EOS_token)\n",
    "    result = Variable(torch.LongTensor(indexes).view(-1, 1)) # seq*1\n",
    "    if use_gpu:\n",
    "        return result.cuda()\n",
    "    else:\n",
    "        return result\n",
    "\n",
    "def variablesFromPair(pair, use_gpu):\n",
    "    input_variable = variableFromSentence(input_lang, pair[0], use_gpu)\n",
    "    target_variable = variableFromSentence(output_lang, pair[1], use_gpu)\n",
    "    return (input_variable, target_variable)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(Variable containing:\n",
      "    4\n",
      "  655\n",
      "  109\n",
      " 1476\n",
      "  555\n",
      " 2318\n",
      "  605\n",
      "    6\n",
      "    1\n",
      "[torch.cuda.LongTensor of size 9x1 (GPU 0)]\n",
      ", Variable containing:\n",
      "    6\n",
      "  772\n",
      "  790\n",
      "   78\n",
      "   68\n",
      "   16\n",
      "  173\n",
      "   45\n",
      "  190\n",
      "  734\n",
      " 1055\n",
      " 1242\n",
      "    4\n",
      "    1\n",
      "[torch.cuda.LongTensor of size 14x1 (GPU 0)]\n",
      "), (Variable containing:\n",
      "    4\n",
      "  111\n",
      "  109\n",
      "  434\n",
      "  352\n",
      "    6\n",
      "    1\n",
      "[torch.cuda.LongTensor of size 7x1 (GPU 0)]\n",
      ", Variable containing:\n",
      "    6\n",
      "   67\n",
      "  166\n",
      "  500\n",
      " 1060\n",
      "    4\n",
      "    1\n",
      "[torch.cuda.LongTensor of size 7x1 (GPU 0)]\n",
      ")]\n"
     ]
    }
   ],
   "source": [
    "# 随机获取2个训练数据集， 这里我们依旧不用进行 batch 处理，下一章节 attention 机制中，我们再进行 batch 处理\n",
    "example_pairs = [variablesFromPair(random.choice(pairs), config.use_gpu)\n",
    "                      for i in range(2)]\n",
    "print(example_pairs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第三步：构建编码器\n",
    "\n",
    "编码器的结构，如图所示：\n",
    "\n",
    "\n",
    "![encoder-network](./images/encoder-network.png)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class Encoder(nn.Module):\n",
    "    def __init__(self, input_size, hidden_size):\n",
    "        super(Encoder, self).__init__()\n",
    "        self.hidden_size = hidden_size\n",
    "        self.embedding = nn.Embedding(input_size, hidden_size)\n",
    "        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)\n",
    "    \n",
    "    def forward(self, x, hidden):\n",
    "        embedded = self.embedding(x).view(1, x.size()[0], -1)\n",
    "        output = embedded  # batch*seq*feature\n",
    "        output, hidden = self.gru(output, hidden)\n",
    "        return output, hidden\n",
    "    \n",
    "    def initHidden(self, use_gpu):\n",
    "        result = Variable(torch.zeros(1, 1, self.hidden_size))\n",
    "        if use_gpu:\n",
    "            return result.cuda()\n",
    "        else:\n",
    "            return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第四步：构建解码器\n",
    "\n",
    "编码器的结构，如图所示：\n",
    "\n",
    "![decoder-network](./images/decoder-network.png)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class Decoder(nn.Module):\n",
    "    def __init__(self, hidden_size, output_size):\n",
    "        super(Decoder, self).__init__()\n",
    "        self.hidden_size = hidden_size\n",
    "\n",
    "        self.embedding = nn.Embedding(output_size, hidden_size)\n",
    "        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)\n",
    "        self.out = nn.Linear(hidden_size, output_size)\n",
    "        self.softmax = nn.LogSoftmax()\n",
    "\n",
    "    def forward(self, x, hidden):\n",
    "        output = self.embedding(x).view(1, 1, -1)\n",
    "        output = F.relu(output)\n",
    "        output, hidden = self.gru(output, hidden)\n",
    "        output = self.softmax(self.out(output[0]))\n",
    "        return output, hidden\n",
    "\n",
    "    def initHidden(self, use_gpu):\n",
    "        result = Variable(torch.zeros(1, 1, self.hidden_size))\n",
    "        if use_gpu:\n",
    "            return result.cuda()\n",
    "        else:\n",
    "            return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第五步：开始训练\n",
    "\n",
    "定义优化器、损失函数，然后开始进行训练\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 实例化模型\n",
    "\n",
    "encoder = Encoder(input_lang.n_words, config.hidden_size)\n",
    "encoder = encoder.cuda() if config.use_gpu else encoder\n",
    "\n",
    "decoder = Decoder(config.hidden_size, input_lang.n_words)\n",
    "decoder = decoder.cuda() if config.use_gpu else decoder\n",
    "\n",
    "# 定义优化器\n",
    "\n",
    "encoder_optimizer = optim.Adam(encoder.parameters(), lr=config.encoder_lr)\n",
    "\n",
    "decoder_optimizer = optim.Adam(decoder.parameters(), lr=config.decoder_lr)\n",
    "\n",
    "\n",
    "# 定义损失函数\n",
    "\n",
    "fn_loss = nn.NLLLoss()\n",
    "\n",
    "training_pairs = [variablesFromPair(random.choice(pairs), config.use_gpu)\n",
    "                      for i in range(config.train_num)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "loss is: 4.5816\n",
      "loss is: 4.2383\n",
      "loss is: 4.3779\n",
      "loss is: 5.1212\n",
      "loss is: 5.6612\n",
      "loss is: 2.5274\n",
      "loss is: 4.0733\n",
      "loss is: 2.4672\n",
      "loss is: 3.5207\n",
      "loss is: 1.3444\n"
     ]
    }
   ],
   "source": [
    "# 开始训练\n",
    "for iter in range(1, config.train_num+1):\n",
    "    training_pair = training_pairs[iter - 1]\n",
    "    input_variable = training_pair[0]  # seq_len * 1\n",
    "    target_variable = training_pair[1]  # seq_len * 1\n",
    "    \n",
    "    loss = 0\n",
    "    \n",
    "    # 训练过程\n",
    "    encoder_hidden = encoder.initHidden(config.use_gpu)\n",
    "    encoder_optimizer.zero_grad()\n",
    "    decoder_optimizer.zero_grad()\n",
    "\n",
    "    input_length = input_variable.size()[0]\n",
    "    target_length = target_variable.size()[0]\n",
    "    \n",
    "    # 传入 encoder\n",
    "    encoder_output, encoder_hidden = encoder(input_variable, encoder_hidden)\n",
    "    \n",
    "    # decoder 起始\n",
    "    decoder_input = Variable(torch.LongTensor([[SOS_token]]))\n",
    "    decoder_input = decoder_input.cuda() if config.use_gpu else decoder_input\n",
    "    \n",
    "    decoder_hidden = encoder_hidden\n",
    "    \n",
    "    for di in range(target_length):\n",
    "        decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden)          \n",
    "        targ = target_variable[di]\n",
    "        loss += fn_loss(decoder_output, targ)\n",
    "        decoder_input = targ\n",
    "    \n",
    "    # 反向求导\n",
    "    loss.backward()\n",
    "    # 更新梯度\n",
    "    encoder_optimizer.step()\n",
    "    decoder_optimizer.step()\n",
    "    \n",
    "    print_loss = loss.data[0] / target_length\n",
    "    \n",
    "    if iter % config.print_epoch == 0:\n",
    "        print(\"loss is: %.4f\" % (print_loss))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 第六步：随机采样，对模型进行测试\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def sampling(encoder, decoder):\n",
    "    # 随机选择一个句子\n",
    "    pair = random.choice(pairs)\n",
    "    print('>', pair[0])\n",
    "    print('=', pair[1])\n",
    "    # 扔进模型中，进行翻译\n",
    "    input_variable = variableFromSentence(input_lang, pair[0], config.use_gpu)\n",
    "    input_length = input_variable.size()[0]\n",
    "    encoder_hidden = encoder.initHidden(config.use_gpu)\n",
    "    encoder_output, encoder_hidden = encoder(input_variable, encoder_hidden)\n",
    "    \n",
    "    decoder_input = Variable(torch.LongTensor([[SOS_token]]))\n",
    "    decoder_input = decoder_input.cuda() if config.use_gpu else decoder_input\n",
    "    decoder_hidden = encoder_hidden\n",
    "    \n",
    "    decoded_words = []\n",
    "    \n",
    "    for di in range(config.MAX_Len):\n",
    "        decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden)\n",
    "        topv, topi = decoder_output.data.topk(1)\n",
    "        ni = topi[0][0]\n",
    "        if ni == EOS_token:\n",
    "            decoded_words.append('<EOS>')\n",
    "            break\n",
    "        else:\n",
    "            decoded_words.append(output_lang.index2word[ni])\n",
    "        # 把当前的输出当做输入\n",
    "        decoder_input = Variable(torch.LongTensor([ni]))\n",
    "        decoder_input = decoder_input.cuda() if config.use_gpu else decoder_input\n",
    "        \n",
    "    # 对 decoded_words 进行连接，输出结果\n",
    "    output_sentence = ' '.join(decoded_words)\n",
    "    print('<', output_sentence)\n",
    "    print('')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "> i think maybe tom was right .\n",
      "= 我認為湯姆可能是對的。\n",
      "< 我 不 知 道 他 們 不 能 做 。 <EOS>\n",
      "\n",
      "> you can borrow my car anytime .\n",
      "= 你隨時可以借用我的車。\n",
      "< 你 不 能 再 做 了 。 <EOS>\n",
      "\n",
      "> it is clear what must be done .\n",
      "= 顯然地勢必要做些什麼。\n",
      "< 这 是 我 们 的 意 思 。 <EOS>\n",
      "\n",
      "> i made her angry .\n",
      "= 我讓她生氣。\n",
      "< 我 在 這 裡 一 個 小 時 。 <EOS>\n",
      "\n",
      "> it is difficult to play the piano .\n",
      "= 彈鋼琴很困難。\n",
      "< 这 是 一 个 好 的 人 。 <EOS>\n",
      "\n",
      "> i am curious .\n",
      "= 我很好奇。\n",
      "< 我 喜 欢 。 <EOS>\n",
      "\n",
      "> you d better go home .\n",
      "= 你最好回家。\n",
      "< 你 不 喜 欢 你 的 。 <EOS>\n",
      "\n",
      "> i hung my hat on the peg .\n",
      "= 我把我的帽子掛在掛鉤上。\n",
      "< 我 在 這 裡 一 個 小 時 的 工 作 。 <EOS>\n",
      "\n",
      "> i will pick you up around six .\n",
      "= 我會在六點鐘左右接你。\n",
      "< 我 想 要 你 的 意 思 。 <EOS>\n",
      "\n",
      "> she lived a lonely life .\n",
      "= 她的生活很寂寞。\n",
      "< 她 的 英 語 是 個 月 子 。 <EOS>\n",
      "\n"
     ]
    }
   ],
   "source": [
    "for i in range(10):\n",
    "    sampling(encoder, decoder)"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "## 补充： 对不起，我训练出了一个智障。"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
